Compressed Indexes for Fast Search of Semantic Data
Authors
Abstract
The sheer increase in volume of RDF data demands efficient solutions for the triple indexing problem, that is, to devise a compressed data structure to compactly represent RDF triples while guaranteeing, at the same time, fast pattern matching operations. This problem lies at the heart of delivering good practical performance for the resolution of complex SPARQL queries on large RDF datasets. In this work, we propose a trie-based index layout to solve the problem and introduce two novel techniques to reduce its space of representation for improved effectiveness. The extensive experimental analysis, conducted over a wide range of publicly available real-world datasets, reveals that our best space/time trade-off configuration substantially outperforms the existing state-of-the-art solutions, taking 30-60 percent less space and speeding up query execution by a factor of 2-81×.
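To make the idea of a trie-based triple index concrete, here is a minimal Python sketch. It is an illustration only, not the paper's layout: it assumes triples are already encoded as integer IDs, stores a single subject-predicate-object ordering as three flat sorted arrays plus range pointers, and applies none of the compression techniques the abstract mentions. The names `TripleTrie` and `select` are hypothetical.

```python
from bisect import bisect_left


class TripleTrie:
    """Three-level trie over integer-encoded (s, p, o) triples.

    Illustrative sketch: each level is a flat array; *_ptr arrays give,
    for node i, the half-open range [ptr[i], ptr[i+1]) of its children.
    """

    def __init__(self, triples):
        self.l1, self.l1_ptr = [], []  # distinct subjects -> ranges in l2
        self.l2, self.l2_ptr = [], []  # predicates per subject -> ranges in l3
        self.l3 = []                   # objects per (subject, predicate)
        prev_s = prev_p = None
        for s, p, o in sorted(set(triples)):
            new_s = s != prev_s
            if new_s:
                self.l1.append(s)
                self.l1_ptr.append(len(self.l2))
            if new_s or p != prev_p:
                self.l2.append(p)
                self.l2_ptr.append(len(self.l3))
            self.l3.append(o)
            prev_s, prev_p = s, p
        # Sentinels so that ptr[i + 1] is always defined.
        self.l1_ptr.append(len(self.l2))
        self.l2_ptr.append(len(self.l3))

    @staticmethod
    def _children(arr, lo, hi, key):
        """Positions in arr[lo:hi] matching key (None acts as a wildcard)."""
        if key is None:
            return range(lo, hi)
        i = bisect_left(arr, key, lo, hi)  # each sibling range is sorted
        return range(i, i + 1) if i < hi and arr[i] == key else range(0)

    def select(self, s=None, p=None, o=None):
        """Yield every stored triple matching the pattern."""
        for i in self._children(self.l1, 0, len(self.l1), s):
            for j in self._children(self.l2, self.l1_ptr[i], self.l1_ptr[i + 1], p):
                for k in self._children(self.l3, self.l2_ptr[j], self.l2_ptr[j + 1], o):
                    yield (self.l1[i], self.l2[j], self.l3[k])


index = TripleTrie([(1, 2, 3), (1, 2, 4), (1, 5, 3), (2, 2, 3)])
print(list(index.select(s=1, p=2)))  # pattern (1, 2, ?)
```

In this sketch, patterns that bind a prefix of the ordering, such as (s, p, ?), resolve with binary searches, whereas a pattern like (?, p, o) has to scan every subject. That limitation is one reason a practical index might keep more than one ordering of the triples; how the paper handles this is not stated in the abstract.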
Similar resources
Compressed Text Indexes with Fast Locate
Compressed text (self-)indexes have matured up to a point where they can replace a text by a data structure that requires less space and, in addition to giving access to arbitrary text passages, support indexed text searches. At this point those indexes are competitive with traditional text indexes (which are very large) for counting the number of occurrences of a pattern in the text. Yet, they...
Compressed Inverted Indexes for In-Memory Search Engines
We present the algorithmic core of a full text data base that allows fast Boolean queries, phrase queries, and document reporting using less space than the input text. The system uses a carefully choreographed combination of classical data compression techniques and inverted index based search data structures. It outperforms suffix array based techniques for all the above operations for real wo...
Fast In-Memory XPath Search over Compressed Text and Tree Indexes
A large fraction of an XML document typically consists of text data. The XPath query language allows text search via the equal, contains, and starts-with predicates. Such predicates can efficiently be implemented using a compressed self-index of the document’s text nodes. Most queries, however, contain some parts of querying the text of the document, plus some parts of querying the tree structu...
Fast Compressed Self-Indexes with Deterministic Linear-Time Construction
We introduce a compressed suffix array representation that, on a text T of length n over an alphabet of size σ, can be built in O(n) deterministic time, within O(n log σ) bits of working space, and counts the number of occurrences of any pattern P in T in time O(|P| + log log_w σ) on a RAM machine of w = Ω(log n)-bit words. This new index outperforms all the other compressed indexes that can be bu...
Adaptive search area for fast motion estimation
In this paper a new method for determining the search area for motion estimation algorithm based on block matching is suggested. In the proposed method the search area is adaptively found for each block of a frame. This search area is similar to that of the full search (FS) algorithm but smaller for most blocks of a frame. Therefore, the proposed algorithm is analogous to FS in terms of reg...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2021
ISSN: 1558-2191, 1041-4347, 2326-3865
DOI: https://doi.org/10.1109/tkde.2020.2966609